15 research outputs found

Prediction of microbial communities for urban metagenomics using neural network approach.

Author: Jiang Jyun-Yu
Ju Chelsea J-T
Wang Wei
Zhou Guangyu
Publication venue: eScholarship, University of California
Publication date: 01/10/2019

BACKGROUND: Microbes are closely associated with human health and disease, especially in densely populated cities. Understanding the microbial ecosystem of an urban environment is essential for cities to monitor the transmission of infectious diseases and detect potentially urgent threats. To this end, DNA sample collection and analysis have been conducted at subway stations in major cities. However, city-scale sampling with fine-grained geospatial resolution is expensive and laborious. In this paper, we introduce MetaMLAnn, a neural-network-based approach to infer microbial communities at unsampled locations given information reflecting different factors, including subway line networks, sampling material types, and microbial composition patterns. RESULTS: We evaluate the effectiveness of MetaMLAnn on the public metagenomics dataset collected from multiple locations in the New York and Boston subway systems. The experimental results suggest that MetaMLAnn consistently performs better than five other conventional classifiers across different taxonomic ranks. At the genus level, MetaMLAnn achieves F1 scores of 0.63 and 0.72 on the New York and Boston datasets, respectively. CONCLUSIONS: By exploiting heterogeneous features, MetaMLAnn captures the hidden interactions between microbial compositions and the urban environment, which enables precise predictions of microbial communities at unmeasured locations.

eScholarship - University of California

Adversarial Reweighting for Speaker Verification Fairness

Author: Chen Zeya
Droppo Jasha
Jin Minho
Ju Chelsea J. -T.
Liu Yi-Chieh
Stolcke Andreas
Publication date: 15/07/2022

We address performance fairness for speaker verification using the adversarial reweighting (ARW) method. ARW is reformulated for speaker verification with metric learning, and shown to improve results across different subgroups of gender and nationality, without requiring annotation of subgroups in the training data. An adversarial network learns a weight for each training sample in the batch so that the main learner is forced to focus on poorly performing instances. Using a min-max optimization algorithm, this method improves overall speaker verification fairness. We present three different ARW formulations: accumulated pairwise similarity, pseudo-labeling, and pairwise weighting, and measure their performance in terms of equal error rate (EER) on the VoxCeleb corpus. Results show that the pairwise weighting method can achieve 1.08% overall EER, 1.25% for male and 0.67% for female speakers, with relative EER reductions of 7.7%, 10.1% and 3.0%, respectively. For nationality subgroups, the proposed algorithm showed 1.04% EER for US speakers, 0.76% for UK speakers, and 1.22% for all others. The absolute EER gap between gender groups was reduced from 0.70% to 0.58%, while the standard deviation over nationality groups decreased from 0.21 to 0.19.
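
The figures in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is not from the paper: it assumes that "relative EER reduction" means (baseline - new) / baseline and that the nationality spread is a population standard deviation, neither of which the abstract states. The helper name `implied_baseline` is hypothetical.

```python
import statistics

def implied_baseline(eer, relative_reduction):
    """Baseline EER implied by a new EER and a relative reduction.

    Assumes relative_reduction = (baseline - eer) / baseline.
    """
    return eer / (1.0 - relative_reduction)

# (EER in percent, relative reduction) as reported in the abstract.
reported = {
    "overall": (1.08, 0.077),
    "male": (1.25, 0.101),
    "female": (0.67, 0.030),
}

baseline = {group: implied_baseline(e, r) for group, (e, r) in reported.items()}

# Absolute gender gap after and (implied) before reweighting.
gap_after = reported["male"][0] - reported["female"][0]
gap_before = baseline["male"] - baseline["female"]

# Spread over nationality subgroups (US, UK, all others),
# assuming a population standard deviation.
nationality_std = statistics.pstdev([1.04, 0.76, 1.22])

print(f"gap before: {gap_before:.2f}, gap after: {gap_after:.2f}")
print(f"nationality std: {nationality_std:.2f}")
```

Under these assumptions the implied gaps round to 0.70 and 0.58 and the spread rounds to 0.19, which agrees with the values quoted in the abstract.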

arXiv.org e-Print Archive

The Time-domain Spectroscopic Survey: Target Selection for Repeat Spectroscopy

Author: Abazajian K. N.
Abolfathi B.
Adam D. Myers
Ahn C. P.
Aihara H.
Alam S.
Amy Lebleu
Anderson S. F.
Axel Schwope
Barlow T. A.
Barth A. J.
Bellm E. ed Wozniak P. R.
Berger E.
Blanton M. R.
Butler N. R.
Cackett E. M.
Carles Badenes
Catherine J. Grier
Chelsea L. MacLeod
Chris Z. Waters
Daniel Hoover
Davenport J. R. A.
Dawson K.
Dawson K. S.
Doi M.
Donald P. Schneider
Drake A. J.
Eisenstein D. J.
Eracleous M.
Eracleous M.
Eracleous M.
Eric Morganson
Eugene Magnier
Filiz Ak N.
Filiz Ak N.
Filiz Ak N.
Flewelling H. A.
Flohic H. M. L.
Frieman J. A.
Gaskell C. M. ed Swings J.-P.
Gezari S.
Gibson R. R.
Gizis J. E.
Green P.
Grier C. J.
Gunn J. E.
Gunn J. E.
Guo H.
Hall P. B.
Harris D. W.
Heber U.
Hee-Jong Seo
Hewett P. C.
Inserra C.
Isabelle Pâris
Ivezić Ž
J. G. Fernández-Trincado
Jenny Greene
Jeremy Tinker
Jessie Runnoe
John J. Ruan
Ju W.
Keivan G. Stassun
Kenneth Chambers
Korista K. T.
Kyle Dawson
LaMassa S. M.
Lewis K. T.
Liu J.
Liu X.
LSST Science Collaboration
Lundgren B. F.
Lundgren B. F.
Magnier E. A.
Maoz D.
Margala D.
Massaro F.
Matthew A. Bershady
Merloni A.
Michael Eracleous
Michael R. Blanton
Morgan D. P.
Morganson E.
Myers A. D.
Nick Kaiser
Nigel Metcalfe
Nurten Filiz Ak
Palaversa L.
Patrick B. Hall
Paul J. Green
Peterson B. M.
Prakash A.
Proga D.
R.-P. Kudritzki
Rachael Amaro
Reiners A.
Richards G. T.
Richards G. T.
Roig B.
Ruan J. J.
Ruan J. J.
Sarah J. Schmidt
Schlafly E. F.
Schmidt S. J.
Schmidt S. J.
Schneider D. P.
Schneider D. P.
Scott F. Anderson
SDSS Collaboration
Sean M. McGraw
Sergeev S. G.
Shen Y.
Shen Y.
Shen Y.
Smee S. A.
Sowinski L. G.
Storchi-Bergmann T.
Strateva I. V.
Stubbs C. W.
Sun M.
Suzanne L. Hawley
Tonry J. L.
Trump J. R.
Tsalmantza P.
Turnshek D. A. ed Blades J. C.
VanderPlas J. T.
Vivek Mariappan
Wang L.
Waters T.
West A. A.
West A. A.
White R. L.
William Nielsen Brandt
York D. G.
Yue Shen
Zhang N.-X.
Publication venue: American Astronomical Society

Crossref

TahcoRoll: fast genomic signature profiling via thinned automaton and rolling hash.

Author: Ju Chelsea J-T
Publication date: 03/07/2023

Ezid

Efficient Approach to Correct Read Alignment for Pseudogene Abundance Estimates

Author: Chelsea J.-T. Ju
Wei Wang
Zhuangtian Zhao
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref

TahcoRoll: fast genomic signature profiling via thinned automaton and rolling hash

Author: Jiang Jyun-Yu
Ju Chelsea J-T
Li Ruirui
Li Zeyu
Wang Wei
Publication venue: eScholarship, University of California
Publication date: 01/12/2021

Objectives: Genomic signatures like k-mers have become one of the most prominent approaches to describe genomic data. As a result, myriad real-world applications, such as the construction of de Bruijn graphs in genome assembly, have benefited from recognizing genomic signatures. In other words, an efficient approach to genomic signature profiling is essential for handling high-throughput sequencing reads. However, most existing approaches only recognize fixed-size k-mers, while many research studies have shown the importance of considering variable-length k-mers. Methods: In this paper, we present a novel genomic signature profiling approach, TahcoRoll, which extends the Aho-Corasick algorithm (AC) to the task of profiling variable-length k-mers. We first group nucleotides into two clusters and represent each cluster with a bit. The rolling hash technique is then used to encode signatures and read patterns for efficient matching. Results: In extensive experiments, TahcoRoll significantly outperforms state-of-the-art k-mer counters and can process reads from different sequencing platforms on a budget desktop computer. Conclusions: The single-thread version of TahcoRoll is as efficient as the eight-thread version of the state-of-the-art JellyFish, while the eight-thread TahcoRoll outperforms the eight-thread JellyFish by at least four times.

PubMed Central

eScholarship - University of California

Widespread Allelic Heterogeneity in Complex Traits.

Author: Eskin Eleazar
Hormozdiari Farhad
Joo Jong Wha J
Ju Chelsea J-T
Kichaev Gleb
Pasaniuc Bogdan
Sankararaman Sriram
Segrè Ayellet V
Shifman Sagiv
Won Hyejung
Zhu Anthony
Publication venue: eScholarship, University of California
Publication date: 01/05/2017

Recent successes in genome-wide association studies (GWASs) make it possible to address important questions about the genetic architecture of complex traits, such as allele frequency and effect size. One lesser-known aspect of complex traits is the extent of allelic heterogeneity (AH) arising from multiple causal variants at a locus. We developed a computational method to infer the probability of AH and applied it to three GWASs and four expression quantitative trait loci (eQTL) datasets. We identified a total of 4,152 loci with strong evidence of AH. The proportion of all loci with identified AH is 4%-23% in eQTLs, 35% in GWASs of high-density lipoprotein (HDL), and 23% in GWASs of schizophrenia. For eQTLs, we observed a strong correlation between sample size and the proportion of loci with AH (R² = 0.85, p = 2.2 × 10⁻¹⁶), indicating that limited statistical power prevents identification of AH in other loci. Understanding the extent of AH may guide the development of new methods for fine mapping and association mapping of complex traits.

eScholarship - University of California

Molecular events of apical bud formation in white spruce, Picea glauca

Author: Abrams Suzanne R.
Adams Eri
Allen Carmen C. G.
Cooke Janice E. K.
El Kayal Walid
Ju Chelsea J.-T.
King-Jones Susanne
Zaharia L. Irina
Publication date: 07/01/2011

Bud formation is an adaptive trait that temperate forest trees have acquired to facilitate seasonal synchronization. We have characterized transcriptome-level changes that occur during bud formation of white spruce [Picea glauca (Moench) Voss], a primarily determinate species in which preformed stem units contained within the apical bud constitute most of next season's growth. Microarray analysis identified 4460 differentially expressed sequences in shoot tips during short day-induced bud formation. Cluster analysis revealed distinct temporal patterns of expression, and functional classification of genes in these clusters implied molecular processes that coincide with anatomical changes occurring in the developing bud. Comparing expression profiles in developing buds under long day and short day conditions identified possible photoperiod-responsive genes that may not be essential for bud development. Several genes putatively associated with hormone signalling were identified, and hormone quantification revealed distinct profiles for abscisic acid (ABA), cytokinins, auxin and their metabolites that can be related to morphological changes to the bud. Comparison of gene expression profiles during bud formation in different tissues revealed 108 genes that are differentially expressed only in developing buds and show greater transcript abundance in developing buds than other tissues. These findings provide a temporal roadmap of bud formation in white spruce.

Peer reviewed: Yes
NRC publication: Yes

NRC Publications Archive